Optimum Loudness for Obtaining Maximum Speech Recognition Score.
Similar resources
Maximum likelihood normalization for robust speech recognition
It is well known that additive and channel noise cause shift and scaling in MFCC features. Empirical normalization techniques to estimate and compensate for these effects, such as cepstral mean subtraction and variance normalization, have been shown to be useful. However, these empirical estimates may not be optimal. In this paper, we approach the problem from two directions, 1) use a more robust ...
Obtaining equal loudness contours from Weber fractions
An empirical equation from Riesz's classic study on difference thresholds is treated in a new manner. Reformulating the expression for the Weber fraction allows one to account for the shape of the loudness function at different frequencies. Furthermore, the emerging loudness function unifies both the commonly used power and logarithmic laws of sensation. In principle, equal loudness contours ca...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Parallel tone score association method for tone language speech recognition
Tone is an essential component for word formation in all tone languages. Substantial work has been done on using tone information to improve speech recognition of tone languages. In this paper, a new method, called Parallel Tone Score Association (PTSA), for effectively and efficiently using tone in speech recognition is proposed. Experimental results show that the relative character error rate...
A cepstral domain maximum likelihood beamformer for speech recognition
Recent work by Seltzer [1] indicates that classical approaches to beamforming, which minimize output power while enforcing a distortionless constraint, do not yield optimal results in terms of word error rate (WER) on speech recognition tasks. This problem can be traced back to the mismatch between the target criterion of classical adaptive beamformers, which is optimization of the signal to noise r...
Journal
Journal title: AUDIOLOGY JAPAN
Year: 2001
ISSN: 1883-7301,0303-8106
DOI: 10.4295/audiology.44.114